New ability estimators have been proposed by Wainer and Wright (1980)
and Mislevy and Bock (1981) that are resistant to the guessing and
careless behavior exhibited by some examinees. This paper presents
another class of ability estimators that are resistant to departures
from the underlying assumptions concerning guessing and carelessness.
In addition to computing the asymptotic relative efficiency of such
estimators, this paper evaluates the estimators by comparing their
influence curves (Huber, 1981).

1. Introduction

New ability estimators have been proposed by Wainer and Wright
(1980) and Mislevy and Bock (1981) that are resistant to the
guessing and careless behavior exhibited by some examinees. This
paper presents another class of ability estimators that are resistant
to departures from the underlying assumptions concerning guessing
and carelessness. In addition to computing the asymptotic relative
efficiency of such estimators, this paper evaluates the estimators
by comparing their influence curves (Huber, 1981).

Two difficulties had to be overcome in deriving the asymptotic
behavior of the estimators. The first is that the desired results do
not follow directly from those for maximum likelihood estimators,
since the new class of estimators includes some estimators that are
not maximum likelihood. In fact, the asymptotic behavior has been
derived from first principles. The second difficulty is that item
responses are not identically distributed random variables when the
items differ in difficulty, discrimination, or guessing
characteristics. This has been overcome by assuming the items to be
randomly sampled from a parent population.

2. Definition and Motivation of a New Class of Estimators

Let $x_1, \ldots, x_n$ be independent dichotomous item responses such
that for a candidate with ability T, a real-valued parameter,

$$\Pr(X_i = x_i \mid T) = [P_i(T)]^{x_i}\,[1 - P_i(T)]^{1 - x_i}, \qquad x_i = 1 \text{ or } 0,$$

where the $P_i(\cdot)$'s, called item response curves, are possibly
different mappings from the real line to the unit interval [0, 1].
For convenience the subscript i will be deleted.

The proposed estimator of T is defined as the solution to the
equation

$$\sum a\,[x - P(T)]\,[P(T)Q(T)]^{h-1} = 0, \tag{2.1}$$

where the sum is over all items, Q = 1 - P, the a's are given but
possibly different constants, and h is a real number greater than
or equal to 1. In what follows we refer to these estimators as
h-estimators.

The value of h is chosen according to how much robustness is
desired; the greater the value of h, the more robust the estimator
is. The choice of h depends on how large an asymptotic variance can
be tolerated in exchange for reducing the influence of individual
responses on the estimator. More discussion on this topic will
follow in the next several sections.
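
As a minimal sketch of how an h-estimate could be computed, the code below solves the estimating equation (2.1) by bisection under the two-parameter logistic model. The function names, the bracket [-6, 6], and the response vector are illustrative assumptions, not part of the paper; the difficulty grid is the one described in Example 1. For h > 1 the left-hand side of (2.1) need not be monotone in T, so bisection returns one root inside the bracket, not necessarily the only one.

```python
import math

def p2pl(t, a, b):
    """Two-parameter logistic response curve P(t)."""
    return 1.0 / (1.0 + math.exp(-a * (t - b)))

def h_score(t, x, a, b, h):
    """Left-hand side of (2.1): sum of a * (x - P) * (P * Q) ** (h - 1)."""
    total = 0.0
    for xi, ai, bi in zip(x, a, b):
        p = p2pl(t, ai, bi)
        total += ai * (xi - p) * (p * (1.0 - p)) ** (h - 1.0)
    return total

def h_estimate(x, a, b, h, lo=-6.0, hi=6.0, tol=1e-10):
    """Bisection for a root of h_score on [lo, hi].

    Assumes the score changes sign on the bracket, which requires
    at least one correct and one incorrect response.
    """
    f_lo, f_hi = h_score(lo, x, a, b, h), h_score(hi, x, a, b, h)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = h_score(mid, x, a, b, h)
        if f_lo * f_mid <= 0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # root lies in [mid, hi]
    return 0.5 * (lo + hi)

# Difficulty grid of Example 1; the responses are hypothetical.
b = [-0.8 + 0.2 * i for i in range(10)]
a = [1.0] * 10
x = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]
estimates = {h: h_estimate(x, a, b, h) for h in (1.0, 1.5, 2.0, 3.0)}
```

With h = 1 the root satisfies the familiar condition that the sum of the fitted probabilities equals the number of correct responses, which gives a quick sanity check on the solver.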

Under certain circumstances, h-estimators correspond to maximum
likelihood estimators (mle's). If, in addition to the above
assumptions, dP(t)/dt = P'(t) exists for each item, the mle is the
solution to

$$\sum P'(T)\,[x - P(T)]\,[P(T)Q(T)]^{-1} = 0.$$

Furthermore, if all P satisfy the two-parameter logistic model

$$\ln[P(t)/Q(t)] = a(t - b),$$

then

$$P'(t) = a\,P(t)\,Q(t).$$

So for h = 1, the h-estimator is the mle for the two-parameter
logistic model.

Both h-estimators and mle's are special cases of a more general kind
of estimator: those that are solutions to

$$\sum w(T)\,[x - P(T)] = 0$$

for some function w.

For various kinds of w-functions we have:

1) An h-estimator is the special case

$$w(t) = a\,[P(t)Q(t)]^{h-1}.$$

2) An mle arises when

$$w(t) = d\ln[P(t)/Q(t)]/dt.$$

We will denote the weight functions of the h-estimators by

$$w(t;a,h) = a\,[P(t)Q(t)]^{h-1}. \tag{2.2}$$

A priori, h-estimators with reasonable weight functions should
possess good robustness properties. First, they should not be overly
influenced by any one item response and second, they should be stable
when the true model for response departs from the assumed model for
response. Reasons for these assertions are discussed as follows.

h-estimators should resist the influence of single outlying responses

Suppose a new item is administered and that h is greater than one.
If $T_n$, based on n responses, is already such that $P_{n+1}(T_n)$ is
near 0, but $x_{n+1} = 1$, then
$[x_{n+1} - P_{n+1}(T_n)]\,w_{n+1}(T_n; a_{n+1}, h)$ is a relatively small
contribution to the sum in (2.1) defining the new estimator $T_{n+1}$.
So the new observation will not dramatically change the old estimate
of T. A similar finding holds for $P_{n+1}(T_n)$ near 1 but
$x_{n+1} = 0$, since the sum in (2.1) is symmetric in the value of
$x_{n+1}$.
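
The size of such a contribution can be illustrated numerically. The sketch below evaluates a single item's term in (2.1) for a surprising correct response; the probability 0.05 and the helper name `contribution` are hypothetical choices made only for illustration.

```python
def contribution(x, p, a=1.0, h=1.0):
    """One item's term in (2.1): a * (x - P) * (P * Q) ** (h - 1)."""
    return a * (x - p) * (p * (1.0 - p)) ** (h - 1.0)

# Hypothetical case: at the current estimate the new item has P = 0.05,
# yet the examinee answers it correctly (x = 1).
p_new = 0.05
surprising = {h: contribution(1, p_new, h=h) for h in (1, 1.5, 2, 3, 4)}

# Mirror case: P = 0.95 with an incorrect response (x = 0).
mirror = {h: contribution(0, 1.0 - p_new, h=h) for h in (1, 1.5, 2, 3, 4)}
```

For h = 1 the surprising response contributes 1 - P = 0.95, while each increase of h by one multiplies the term by a further factor of PQ = 0.0475. The mirror case gives the same magnitudes with the opposite sign.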

Example 1. Suppose the model for item response is the logistic model
with $b_i$ = -0.8 to 1.0 by steps of 0.2 and $a_i$ = 1. Table 1 displays
the values of the h-estimator for two item response sequences: one
without an outlier and one with an outlier. The h-estimators are less
affected by the outlier than the mle (h = 1). Further, the larger the
value of h, the smaller the effect.

h-estimators should be insensitive to departures from the model

Suppose the true model is P*, unequal to P, for each item.
One could retain the old weight function, so that the estimator
remains resistant to outliers, but solve

$$\sum [x - P^*(T)]\,w(T;a,h) = 0 \tag{2.3}$$

for a reasonable estimator of ability. Because
x - P(T) = [x - P*(T)] + [P*(T) - P(T)], the left-hand side of (2.1)
equals the left-hand side of (2.3) plus the term

$$\sum [P^*(T) - P(T)]\,w(T;a,h).$$

If this term is small, then solutions to (2.1) are close to solutions
to (2.3). Since this term gets smaller as h gets larger, we expect
h-estimators to be robust to departures from the model.

In fact, the property described is precisely continuity of the
estimator when viewed as a function of the $P_i$'s. We show in what
follows that h-estimators are continuous functions under the
proper mathematical setting.

3. The Influence Function and Other Heuristics

The influence function is a useful tool in robust statistics.
Not only does it allow the evaluation of the influence of outliers
on the estimator under investigation, it also allows a heuristic
derivation of the limiting law of its sample distribution function
(Hampel, 1968, 1974; Huber, 1981). Of course, the result must be
checked with a rigorous mathematical proof.

The Influence Function of the h-estimator

For general statistical estimators, the derivation of the
influence function is facilitated by viewing the estimator as a
function of the probability distribution function F of the
observations. Then the influence function is derived from the Gateaux
derivative (Huber, 1981) of the estimator function, T(F), at a
distribution function F in the direction G:

$$\dot{T}(G;F) = \lim_{s \to 0} \frac{T((1-s)F + sG) - T(F)}{s}.$$

The last expression is the ordinary derivative of T((1-s)F+sG)
with respect to s, evaluated at s = 0. (Further discussion and
references will be found in Huber (1981), pages 13 and 37.) The
influence function at $z_0$ is the Gateaux derivative in the direction
$G(z) = d(z, z_0)$, which is the point-mass at $z_0$.

The difficulty in determining the influence function of item
ability estimators lies in representing them as functions of the
probability distribution functions of the observations. The defining
equation (2.1) relates each value of the estimator to every set of
values of item responses, item response curves, and constants:
$z_i = (x_i, P_i, a_i)$, i = 1, ..., n. Denote the point-mass at $z_i$
as above, and define the empirical distribution function as
$F_n(z) = \sum_i d(z, z_i)/n$. Clearly, the estimator defined in (2.1)
depends on $F_n$, since (2.1) is equivalent to

$$\int [x - P(T)]\,w(T;a,h)\,dF_n = 0. \tag{3.1}$$

Denote this dependence by the functional notation $T = T(F_n)$.
The function $T(\cdot)$ will be extended to any probability mass
function F by replacing $F_n$ by F in (3.1):

$$\int [x - P(T(F))]\,w(T(F);a,h)\,dF = 0. \tag{3.2}$$

Since the functional notation is defined implicitly,
substitution of (1-s)F+sG for F in (3.2) and chain-rule
differentiation with respect to s yields an equation involving
the Gateaux derivative:

$$\dot{T}(G;F)\,\frac{d}{dt}\int [x - P(t)]\,w(t;a,h)\,dF\,\Big|_{t=T} + \int [x - P(T)]\,w(T;a,h)\,(dG - dF) = 0, \tag{3.3}$$

where T = T(F). Because (3.2) is satisfied, letting $G = d(z, z_i)$
in (3.3) yields the following influence function:

$$IC(z_i;F,T) = -\,\frac{[x_i - P_i(T)]\,w_i(T;a_i,h)}{\dfrac{d}{dt}\displaystyle\int [x - P(t)]\,w(t;a,h)\,dF\,\Big|_{t=T}}.$$

(The notation IC refers to Influence Curve.) Here the subscript
for w is added to emphasize its dependence on the i-th
response curve.
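
To make the influence function concrete, here is a small sketch that evaluates it under two added assumptions that are not part of the derivation above: the two-parameter logistic model holds at ability t, and items are drawn with equal probability from a finite list. Under those assumptions the conditional mean of x - P(t) is zero and P' = aPQ, so the derivative in the denominator reduces to minus the average of $a^2 (PQ)^h$. The function names are hypothetical.

```python
import math

def p2pl(t, a, b):
    """Two-parameter logistic response curve P(t)."""
    return 1.0 / (1.0 + math.exp(-a * (t - b)))

def influence(x_i, a_i, b_i, t, items, h):
    """Influence of response x_i on item (a_i, b_i) at ability t.

    Assumes the 2PL model holds and items are equally likely, so the
    derivative of the integral equals -mean(a^2 * (P*Q)**h).
    """
    p_i = p2pl(t, a_i, b_i)
    numer = (x_i - p_i) * a_i * (p_i * (1.0 - p_i)) ** (h - 1.0)
    slope = 0.0
    for a, b in items:
        p = p2pl(t, a, b)
        slope -= a * a * (p * (1.0 - p)) ** h
    slope /= len(items)
    return -numer / slope

# Item pool of Example 1 (a = 1, b from -0.8 to 1.0 in steps of 0.2).
items = [(1.0, -0.8 + 0.2 * j) for j in range(10)]
ic_correct = {h: influence(1, 1.0, 1.0, -1.0, items, h) for h in (1.0, 2.0, 3.0)}
ic_wrong = {h: influence(0, 1.0, 1.0, -1.0, items, h) for h in (1.0, 2.0, 3.0)}
```

A correct response always has positive influence and an incorrect one negative influence, and for each item the two values average to zero when weighted by P and Q, as expected of an influence function evaluated at the model.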

Comparisons between the influence curves of estimators for
different values of T are useful. The usual notion of an
influence function is a curve in x; however, since x takes only two
values, this notion is not useful for item response theory. On
the other hand, these influence functions are curves in T;
however, graphs of the influence function are not as easily
manufactured as they are for the problems typically
investigated by statisticians. The difficulty lies in the dependence
of the denominator on the value of T.

A few examples of the influence curves are plotted in Figure 1
as functions of F(T). The curves for h strictly greater than one,
which excludes the mle, approach zero asymptotically as T approaches
infinity. This indicates that outlying responses have less influence
for very large or very small abilities. The behavior of the influence
function for h equal to one, the mle, reveals that the largest
influences are obtained for T approaching negative or positive
infinity.

Compared to the mle, the influence function of the h-estimator
is redescending; that is, for either response, the influence starts
at zero, rises and returns to zero. Hence, the properties of influence
functions of h-estimators are analogous to those of redescending
M-estimators for the location problem in standard statistics.

Conjectured Asymptotic Normality of h-estimators

If $F_n$ has the limiting value F, then the one-term Taylor
expansion entails

$$T(F_n) = T(F) + \frac{1}{n}\sum_i IC(Z_i;F,T) + R(F_n,F),$$

where $R(F_n,F)$ is the remainder term depending on $F_n$ and F. Here
we have denoted $Z_i$ to be a random triplet with distribution F. We
would expect, for heuristic reasons, that $n^{1/2}R(F_n,F)$ converges
to zero. Consequently, $n^{1/2}[T(F_n) - T(F)]$ would have the same
limiting distribution as $n^{-1/2}\sum_i IC(Z_i;F,T)$. We state the
consequences of these results as a theorem.

Theorem. If the responses and items are sampled so that the $Z_i$ are
independent replicates from F and $n^{1/2}R(F_n,F)$ converges to zero,
then $n^{1/2}[T(F_n) - T(F)]$ has a limiting normal distribution with
mean 0 and variance

$$\frac{\displaystyle\int a^2\,[x - P(T)]^2\,[P(T)Q(T)]^{2(h-1)}\,dF}{\left[\dfrac{d}{dt}\displaystyle\int a\,[x - P(t)]\,[P(t)Q(t)]^{h-1}\,dF\,\Big|_{t=T}\right]^2}. \tag{3.4}$$

In the following, this theorem will be proven with sufficient
conditions weaker than $n^{1/2}R(F_n,F)$ converging to zero.




4. Asymptotic Properties

We will describe a probability structure that conveniently
yields the asymptotic behavior of the ability estimators in
item response theory. Let Z be the set of all triples z = (x, P, a),
where x = 0 or 1, P ranges through a finite set $\mathcal{P}$ of
non-decreasing maps from the entire real line to the closed unit
interval, and a ranges through a finite set A of positive numbers.
The set Z, imbued with the discrete topology on its power set, is the
sample space.

Distributions over Z are defined as follows. If {f(z), z in Z} is
a set of positive weights summing to one, then for a subset B the
associated distribution function F on Z is expressed as
$F(B) = \sum_{z \in B} f(z)$. Let $z_1, \ldots, z_n$ be a random sample
from F; then the empirical distribution function is
$F_n(B) = \sum_{i=1}^{n} d(z_i, B)/n$, where d(z, B) = 1 if z is in B
and 0 otherwise. $F_n$ enjoys several properties following from the
Strong Law of Large Numbers and the Central Limit Theorem: a) for any
real-valued function g on Z, $\int g(z)\,dF_n(z)$ converges to
$\int g(z)\,dF(z)$ wp1, and b) if $\int g^2(z)\,dF$ is finite,
$n^{1/2}[\int g(z)\,dF_n(z) - \int g(z)\,dF(z)]$ is asymptotically
normal with mean zero and variance $\int [g(z) - \bar{g}]^2\,dF(z)$,
where $\bar{g} = \int g(z)\,dF(z)$. The latter integration is the
expectation of g with respect to F, denoted by $E_F g$. In what
follows we consider

$$g(z) = g(z;t) = a\,[x - P(t)]\,[P(t)(1 - P(t))]^{h-1}.$$

Define $m_F(t) = E_F\,g(Z;t)$. We assume throughout that
$Z_1, \ldots, Z_n$ is a random sample from F with empirical
distribution function $F_n$.

Theorem 1 (Consistency). Let $t_0$ be an isolated root of $m_F(t) = 0$.
Suppose that P(t) is continuous in t for each P in $\mathcal{P}$. If
$T_n$ is a solution sequence to the empirical equation
$m_{F_n}(T_n) = 0$, then $T_n$ converges to $t_0$ wp1.

Proof of Theorem 1. Since P(t) is continuous, $m_F(t)$ is continuous
in t. Therefore, for each $\varepsilon$ sufficiently small,
$m_F(t_0 - \varepsilon)$ and $m_F(t_0 + \varepsilon)$ are opposite in
sign. Without loss of generality assume
$m_F(t_0 - \varepsilon) < 0 < m_F(t_0 + \varepsilon)$. Since g(Z;t) is
bounded in z for each t, the Strong Law of Large Numbers implies
$m_{F_n}(t) \to m_F(t)$ wp1. Hence

$$\lim_{n \to \infty} P\{\,m_{F_m}(t_0 - \varepsilon) < 0 < m_{F_m}(t_0 + \varepsilon), \text{ for all } m > n\,\} = 1.$$

But this implies

$$\lim_{n \to \infty} P\{\,t_0 - \varepsilon < T_m < t_0 + \varepsilon, \text{ for all } m > n\,\} = 1.$$

The proof is complete.
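
The consistency statement can be explored with a small simulation. The following is only a sketch under assumptions chosen for illustration: items come from the ten-item pool of Example 1 with equal probability, responses follow the two-parameter logistic model at a hypothetical true ability of 0.5, h = 2, and the random seed is arbitrary. Because there are only ten distinct items, the empirical equation is evaluated from per-item counts.

```python
import math
import random

def p2pl(t, a, b):
    """Two-parameter logistic response curve P(t)."""
    return 1.0 / (1.0 + math.exp(-a * (t - b)))

def simulate_counts(n, true_t, items, rng):
    """Draw n (item, response) pairs; return per-item [trials, correct]."""
    counts = [[0, 0] for _ in items]
    for _ in range(n):
        j = rng.randrange(len(items))
        a, b = items[j]
        counts[j][0] += 1
        counts[j][1] += int(rng.random() < p2pl(true_t, a, b))
    return counts

def empirical_m(t, counts, items, h):
    """n times the empirical estimating function, grouped by item."""
    total = 0.0
    for (trials, correct), (a, b) in zip(counts, items):
        p = p2pl(t, a, b)
        total += a * (correct - trials * p) * (p * (1.0 - p)) ** (h - 1.0)
    return total

def solve(counts, items, h, lo=-6.0, hi=6.0, iters=80):
    """Bisection; assumes empirical_m is positive at lo and negative at hi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if empirical_m(mid, counts, items, h) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

items = [(1.0, -0.8 + 0.2 * j) for j in range(10)]
rng = random.Random(0)
true_t = 0.5
errors = {}
for n in (100, 10_000, 100_000):
    counts = simulate_counts(n, true_t, items, rng)
    errors[n] = abs(solve(counts, items, h=2.0) - true_t)
```

The dictionary `errors` holds the absolute estimation error at each sample size; in this setting the error should be small for the larger samples, though any single run is subject to sampling noise.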

Remark 1. Theorem 1 is valid even for a model of item responses
that is different from the assumed model. The simplest way to
see this is to allow F = F*, where F* induces a random variable P* in
$\mathcal{P}$ such that $E_{F^*}(X \mid P^*) = P^*$.

Remark 2. Even if $t_0$ does not correspond to T, the true ability,
the solution sequence $T_n$ will converge to $t_0$, and $t_0$ cannot
correspond to T unless $m_{F^*}(T) = 0$.

The final remark brings us to our next theorem about the
magnitude of the difference between the true ability T and the limit
of a sequence of estimators under an alternative model. Let F* be
a distribution over Z induced by a mapping from $\mathcal{P}$ into
$\mathcal{P}$ denoted by P*. The item response curve alternative to P
is P*(P) and its value at t is denoted by P*(P)(t). Define

$$m^*_F(t) = E_F\,a\,[X - P^*(P)(t)]\,[P(t)Q(t)]^{h-1}.$$

Define the true ability to be a solution to $m^*_F(t) = 0$.

Theorem 2 (Asymptotic Bias). Let $t_0$ be a solution to $m_F(t) = 0$
and let T be a solution to $m^*_F(t) = 0$. Then

$$T - t_0 = m_F(T)\Big/\left[\,dm_F(t)/dt\,\big|_{t=t_1}\right],$$

where $|t_1 - t_0| < |T - t_0|$.

Proof. The Mean Value Theorem implies

$$m_F(T) - m_F(t_0) = (T - t_0)\,dm_F(t)/dt\,\big|_{t=t_1},$$

where $|t_1 - t_0| < |T - t_0|$. The theorem follows from the
definition of $t_0$.
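
As an illustration of Theorem 2, the sketch below computes the limit $t_0$ and a first-order approximation to $T - t_0$ for one particular departure from the model. The departure is an assumption made here for the sake of example and is not taken from the paper: responses follow P*(t) = c + (1 - c)P(t), a guessing floor c added to the assumed two-parameter logistic curve P. Items are the pool of Example 1 drawn with equal probability; the values c = 0.2 and T = 0 are hypothetical.

```python
import math

def p2pl(t, a, b):
    """Two-parameter logistic response curve P(t)."""
    return 1.0 / (1.0 + math.exp(-a * (t - b)))

def m_F(t, true_t, items, h, c):
    """Population estimating function when responses follow
    P*(t) = c + (1 - c) * P(t) but the estimator assumes P."""
    total = 0.0
    for a, b in items:
        p = p2pl(t, a, b)
        p_star = c + (1.0 - c) * p2pl(true_t, a, b)
        total += a * (p_star - p) * (p * (1.0 - p)) ** (h - 1.0)
    return total / len(items)

def limit_t0(true_t, items, h, c, lo=-8.0, hi=8.0, iters=100):
    """Bisection for a root of m_F; assumes m_F > 0 at lo and < 0 at hi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if m_F(mid, true_t, items, h, c) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def first_order_bias(true_t, items, h, c, eps=1e-5):
    """m_F(T) / m_F'(T): Theorem 2 with the unknown t1 replaced by T,
    using a central difference for the derivative."""
    slope = (m_F(true_t + eps, true_t, items, h, c)
             - m_F(true_t - eps, true_t, items, h, c)) / (2.0 * eps)
    return m_F(true_t, true_t, items, h, c) / slope

items = [(1.0, -0.8 + 0.2 * j) for j in range(10)]
true_t, c = 0.0, 0.2
limits = {h: limit_t0(true_t, items, h, c) for h in (1.0, 2.0, 3.0)}
approx = {h: first_order_bias(true_t, items, h, c) for h in (1.0, 2.0, 3.0)}
```

Here `true_t - limits[h]` is the exact asymptotic bias for this hypothetical departure and `approx[h]` is the approximation obtained by evaluating the derivative at T instead of at the intermediate point $t_1$. With a guessing floor, the limit lies above the true ability, since every item is answered correctly more often than the assumed curve predicts.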

The next corollary ensures that any solution sequence converges
with probability one. This is a strong result that indicates how
strong the standard assumptions of item response theory are.

Corollary 1. Suppose P(t) is continuous in t for each P in
$\mathcal{P}$. Assume that for each P in $\mathcal{P}$, P is a cdf in
t. If $T_n$ is a solution to the empirical equation
$m_{F_n}(T_n) = 0$, then $T_n$ converges wp1.

Proof. We must show that $m_{F^*}(t) = 0$ always has a solution for any
F* that is the limit of $F_n$. Suppose that $m_{F^*}(t) \neq 0$ for any
t. Then F* gives positive probability to a set of P* and P such that
$P^*(T) - P(t) \neq 0$ for all t. Since $0 < P^*(T) < 1$, the last
condition leads to a contradiction to the assumed continuity of
each P.

We now turn to the distributional properties of a solution sequence.

Theorem 3 (Asymptotic Normality). Let $t_0$ be an isolated root of
$m_F(t) = 0$. Suppose that dP(t)/dt is continuous in t uniformly in P.
Let $T_n$ be a solution sequence of $m_{F_n}(t) = 0$ satisfying
$T_n \to t_0$. Then $n^{1/2}(T_n - t_0)$ is asymptotically normal with
mean 0 and variance $A(F, t_0)$ given by

$$A(F,t_0) = E_F\,g^2(Z;t_0)\Big/\left[E_F\,g'(Z;t_0)\right]^2.$$



Proof of Theorem 3. For notational convenience define
$u_i(t) = g(Z_i;t) = a_i\,[X_i - P_i(t)]\,[P_i(t)(1 - P_i(t))]^{h-1}$.
Since $dP_i(t)/dt$ is continuous, we may apply the Mean Value Theorem
to obtain the expansion

$$u_i(T_n) - u_i(t_0) = (T_n - t_0)\,du_i(t)/dt\,\big|_{t=t_n},$$

where $|t_n - t_0| < |T_n - t_0|$. Since $m_{F_n}(T_n) = 0$, we have

$$n^{1/2}(T_n - t_0) = -\,\frac{n^{-1/2}\sum_i u_i(t_0)}{n^{-1}\sum_i du_i(t)/dt\,\big|_{t=t_n}}.$$

The Central Limit Theorem implies that the numerator is
asymptotically normal with mean 0 and variance $E_F\,g^2(Z;t_0)$. The
Strong Law of Large Numbers, the hypothesis that dP(t)/dt is uniformly
continuous, and $T_n \to t_0$ imply that the denominator converges to
$E_F\,g'(Z;t_0)$ wp1. The proof follows from Slutsky's Lemma.

We will use these theorems to compare estimators in the
next section.
Efficiency Comparisons

Recall that when h = 1, the h-estimator corresponds to a certain
mle associated with a two-parameter logistic model having
discrimination parameters equal to the corresponding values of $a_i$
that appear in (2.1). Consequently, it is easy to determine the
asymptotic efficiency of the h-estimator relative to this mle.

The first comparison is made under the associated two-parameter
logistic model and the second comparison is made under a neighboring
two-parameter logistic model. The first model corresponds to
the one appearing in Example 1. The sampling scheme consists of
choosing from the designated items with equal probability.

Table 2 displays the asymptotic relative efficiencies when the
true model corresponds to the assumed model. For the computed values
of h, the h-estimators lose no more than 10 percent efficiency; that
is, at most one in ten items is wasted.
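
A sketch of how such an efficiency could be computed from the variance formula of Theorem 3 follows. It assumes the two-parameter logistic model holds, so that the conditional mean of X is P and dP/dt = aPQ; the two expectations then reduce to $E\,a^2 (PQ)^{2h-1}$ for the numerator and $-E\,a^2 (PQ)^h$ inside the squared denominator. The ability value t = 2.0 is an arbitrary choice made here, since the ability at which Table 2 was computed is not stated, so the output is not expected to reproduce the table.

```python
import math

def p2pl(t, a, b):
    """Two-parameter logistic response curve P(t)."""
    return 1.0 / (1.0 + math.exp(-a * (t - b)))

def asymptotic_variance(t, items, h):
    """E g^2 / (E g')^2 at t, assuming the 2PL model holds and items
    are sampled with equal probability."""
    num = den = 0.0
    for a, b in items:
        p = p2pl(t, a, b)
        pq = p * (1.0 - p)
        num += a * a * pq ** (2.0 * h - 1.0)
        den += a * a * pq ** h
    n = len(items)
    return (num / n) / (den / n) ** 2

def relative_efficiency(t, items, h):
    """Variance of the mle (h = 1) divided by that of the h-estimator."""
    return asymptotic_variance(t, items, 1.0) / asymptotic_variance(t, items, h)

items = [(1.0, -0.8 + 0.2 * j) for j in range(10)]
are = {h: relative_efficiency(2.0, items, h) for h in (1.5, 2.0, 3.0, 4.0, 5.0)}
```

Under these assumptions the relative efficiency never exceeds one (by the Cauchy-Schwarz inequality) and does not increase with h, which matches the direction of the values reported in Table 2.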

Table 3 displays the asymptotic relative efficiency when the true
model generates P*(t) = P(t-0.1) for each sampled item. This is
equivalent to uniformly shifting the difficulty parameters to the left
by 0.1, or 6 percent of the total range of the difficulty parameters.
In computing the efficiency we have used the approximation to the
asymptotic bias given by Theorem 2 in order to compare the mean
squared errors. The h-estimators outperform the mle for each
computed value of h.

7. Tables

Table 1: Values of the h-estimator at chosen response sequences

                                   h
    Responses x    1 (MLE)    1.5      2       3       4
    1111100000     0.11       0.11+    0.14    0.13    0.12
    1111100001     0.58       0.41     0.22    0.20    0.19

Table 2: Efficiency comparison of h-estimators to the MLE under
the assumed model

    h      1.5     2.0     3.0     4.0     5.0
    EFF    .99     .98     .95     .92     .90

Table 3: Efficiency comparison of h-estimators to the MLE under
a neighboring model

    h      1.5     2.0     3.0      4.0      5.0
    EFF    2.24    4.83    19.15    51.46    82.62